Can one understand the statistics of wins and losses of baseball teams? Are their consecutive-game
winning and losing streaks self-reinforcing or can they be described statistically? We apply the
Bradley-Terry model, which incorporates the heterogeneity of team strengths in a minimalist way, to
answer these questions. Excellent agreement is found between the predictions of the Bradley-Terry
model and the rank dependence of the average number of team wins and losses in major-league baseball
over the past century when the distribution of team strengths is taken to be uniform
over a finite range. Using this uniform strength distribution, we also find very good agreement
between model predictions and the observed distribution of consecutive-game team winning and
losing streaks over the last half-century; however, the agreement is less good for the previous
half-century. The behavior of the last half-century supports the hypothesis that long streaks are primarily
statistical in origin with little self-reinforcing component. The data further show that the past
half-century of baseball has been more competitive than the preceding half-century.

I. INTRODUCTION

The physics of systems involving large numbers of interacting
agents is currently a thriving field of research.
One of its many appeals lies in the opportunity
it offers to apply precise methods and tools of physics 
to the realm of "soft" science. In this respect, biolog- 
ical, economic, and a large variety of human systems 
present many examples of competitive dynamics that can 
be studied qualitatively or even quantitatively by statis- 
tical physics. Among them, sports competitions are par- 
ticularly appealing because of the large amount of data 
available, their popularity, and the fact that they constitute
almost perfectly isolated systems. Indeed, most
systems considered in econophysics or evolutionary
biology are strongly affected by external and often
unpredictable factors. For instance, a financial model 
cannot predict the occurrence of wars or natural dis- 
asters which dramatically affect financial markets, nor 
can it include the effect of many other important exter- 
nal parameters (China's GDP growth, German exports, 
Google's profit...). On the other hand, sport leagues
(soccer, baseball, football, ...) or tournaments
(basketball, poker, ...) are basically isolated systems
that are much less sensitive to external influences.
Hence, despite their intrinsic human nature, which actually
contributes to their appeal, competitive sports are
particularly suited to quantitative theoretical modeling. 
In this spirit, this work is focused on basic statistical 
features of game outcomes in Major-League baseball. 

In Major-League baseball and indeed in any competi- 
tive sport, the main observable is the outcome of a single 
game — who wins and who loses. Then at the end of 
a season, the win/loss record of each team is fundamen- 
tal. As statistical physicists, we are not concerned with 
the fates of individual teams, but rather with the average
win/loss record of the 1st, 2nd, 3rd, etc. teams, as
well as the statistical properties of winning and losing 
streaks. We concentrate on major-league baseball to il- 
lustrate statistical properties of game outcomes because 
of the large amount of available data and the near 
constancy of the game rules during the so-called "modern 
era" that began in 1901. 

For non-US readers or non-baseball fans, we note that during the
modern era of major-league baseball, teams have been
divided into the nearly-independent American and National
leagues. At the end of each season, a champion
of each league is determined (the team with the best record
prior to 1961, and the winner of the league playoffs
subsequently); the two league champions then play in the World Series to
determine the overall champion. As the data will reveal, it is
also useful to separate the 1901-1960 early modern era, 
with a 154-game season and 16 teams, and the 1961-2005 
expansion era, with a 162-game season in which the num- 
ber of teams expanded in stages to its current value of 
30, to highlight systematic differences between these two 
periods. Our data is based on the 163674 regular-season 
games that have occurred between 1901 and the end of 
the 2005 season (72741 between 1901-60 and 90933 be- 
tween 1961-2005). 

While the record of each team can change significantly 
from year to year, we find that the time-averaged win/loss
record of the $r$th-ranked team as a function of rank $r$ is
strikingly regular. One of our goals is to understand the
rank dependence of this win fraction. An important outcome
of our study is that the Bradley-Terry (BT) competition
model provides an excellent account of the
team win/loss records. This agreement between the data
and theory is predicated on using a specific form for the
distribution of team strengths. We will argue that the
best match to the data is achieved by using a uniform
distribution of team strengths in each season.
Another goal of this work is to understand the statistical
features of consecutive-game team winning and
losing streaks. The existence of long streaks of all types
of exceptional achievement in baseball, as well as in most
competitive sports, has been well documented and
continues to be the source of analysis and debate among
sports fans. For long consecutive-game team winning and
team losing streaks, an often-invoked theme is the notion
of reinforcement: a team that is "on a roll" is more
likely to continue winning, and vice versa for a slumping
team on a losing streak. The question of whether
streaks are purely statistical or self-reinforcing continues
to be vigorously debated. Using the BT model
and our inferred uniform distribution of team strengths, 
we compute the streak length distribution. We find that 
the theoretical prediction agrees extremely well with the 
streak data during 1961-2005. However, there is a slight 
discrepancy between theory and the tail of the streak dis- 
tribution during 1901-60, suggesting that non-statistical 
effects may have played a role during this early period. 

As a byproduct of our study, we find clear evidence 
that baseball has been more competitive during 1961-2005
than during 1901-60, a feature that has been
found previously. The manifestation of this increased
competitiveness is that the range of team records and the
lengths of streaks were both narrower during the latter period.
This observation fits with the general principle that
outliers become progressively rarer in a highly compet- 
itive environment. Consequently, extremes of achieve- 
ment become less and less likely to occur. 



II. STATISTICS OF THE WIN FRACTION 

A. Bradley-Terry Model

Our starting point to account for the win/loss records
of all baseball teams is the BT model, which incorporates
the heterogeneity in team strengths in a natural
and simple manner. We assume that each team has an
intrinsic strength $x_i$ that is fixed for each season. The
probability that a team of strength $x_i$ wins when it plays
a team of strength $x_j$ is simply

$$p_{ij} = \frac{x_i}{x_i + x_j}. \qquad (1)$$

Thus the winning probability depends continuously on
the strengths of the two competing teams. When two
equal-strength teams play, each team has a 50% probability
to win, while if one team is much stronger, then its
winning probability approaches 1.
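
As a minimal illustration of Eq. (1), the winning probability can be written as a one-line Python function. The name `bt_win_probability` is a hypothetical helper introduced here for illustration only; it is not part of the original analysis.

```python
def bt_win_probability(x_i: float, x_j: float) -> float:
    """Bradley-Terry probability that a team of strength x_i beats one of strength x_j."""
    if x_i <= 0 or x_j <= 0:
        raise ValueError("team strengths must be positive")
    return x_i / (x_i + x_j)


# Equal strengths give an even game.
print(bt_win_probability(1.0, 1.0))
# Only the ratio of strengths matters: both calls describe a 2:1 strength ratio.
print(bt_win_probability(2.0, 1.0), bt_win_probability(0.2, 0.1))
```

The second line of output shows why an overall rescaling of all strengths leaves the model unchanged.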

The form of the winning probability in Eq. (1) is quite
general. Indeed, we can replace the team strength $x_i$ by
any monotonically increasing function $f(x_i)$. The only indispensable
attribute is the ordering of the team strengths. Thus the
notion of strength is coupled to the assumed form of the
winning probability. If we make a hypothesis about one
of these quantities, then the other is no longer a variable
that we are free to choose, but an outcome of the
model. In our analysis, we adopt the form of the winning
probability in Eq. (1) because of its simplicity. Then the
only relevant unknown quantity is the probability distribution
of the $x_i$'s. As we shall see in the next section,
this distribution of team strengths can then be inferred
from the season-end win/loss records of the teams, and
a good fit to the data is obtained when assuming a uniform
distribution of team strengths. Because only the
ratio of team strengths is relevant in Eq. (1), we
take team strengths to be uniformly distributed in
the range $[x_{\min}, 1]$, with $0 < x_{\min} < 1$. Thus the only
model parameter is the value of $x_{\min}$.

For uniformly distributed team strengths $\{x_j\}$ that lie
in $[x_{\min}, 1]$, the average winning fraction for a team of
strength $x$ that plays a large number of games $N$, with
equal frequencies against each opponent, is

$$\begin{aligned}
W(x) &= \frac{1}{N}\sum_{j=1}^{N} \frac{x}{x + x_j} \\
&\to \frac{1}{1 - x_{\min}} \int_{x_{\min}}^{1} \frac{x}{x + y}\, dy = \frac{x}{1 - x_{\min}} \ln\left(\frac{x + 1}{x + x_{\min}}\right), \qquad (2)
\end{aligned}$$

where we assume $N \to \infty$ in the second line. We then
transform from strength $x$ to scaled rank $r$ by $x = x_{\min} +
(1 - x_{\min})\,r$, with $r = 0, 1$ corresponding to the weakest
and strongest team, respectively (Fig. 1). This result for
the win fraction is one of our primary results.
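
The infinite-season win fraction can be checked numerically. The sketch below assumes the closed form that follows from averaging the Bradley-Terry probability $x/(x+y)$ over opponents $y$ uniform in $[x_{\min}, 1]$, namely $W(x) = \frac{x}{1 - x_{\min}} \ln\frac{x+1}{x+x_{\min}}$, and compares it with a direct midpoint-rule average. All function names are illustrative assumptions.

```python
import math


def win_fraction(x: float, x_min: float) -> float:
    """Closed-form win fraction of a team of strength x (opponents uniform in [x_min, 1])."""
    return x / (1.0 - x_min) * math.log((x + 1.0) / (x + x_min))


def win_fraction_by_rank(r: float, x_min: float) -> float:
    """Same quantity as a function of scaled rank r in [0, 1], using x = x_min + (1 - x_min) r."""
    return win_fraction(x_min + (1.0 - x_min) * r, x_min)


def win_fraction_numeric(x: float, x_min: float, n_opponents: int = 100_000) -> float:
    """Midpoint-rule average of x / (x + y) over opponents y uniform in [x_min, 1]."""
    h = (1.0 - x_min) / n_opponents
    total = sum(x / (x + x_min + (k + 0.5) * h) for k in range(n_opponents))
    return total / n_opponents


# Win fractions of the weakest (r = 0) and strongest (r = 1) teams for the two fitted x_min values.
for x_min in (0.278, 0.435):
    print(x_min, win_fraction_by_rank(0.0, x_min), win_fraction_by_rank(1.0, x_min))
```

Averaging `win_fraction_by_rank` over all ranks gives 1/2, as it must, because every game has exactly one winner.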
FIG. 1: Average win fraction $W(r)$ versus scaled rank $r$
for 1901-60 (△) and 1961-2005 (○). For these periods, the
dashed lines are simulation results for the BT model with
$x_{\min} = 0.278$ and 0.435, respectively. The solid curves represent
Eq. (2), corresponding to simulations for an infinitely
long season and an infinite number of teams.

To check the prediction of Eq. (2), we start with a
value of $x_{\min}$ and simulate $10^4$ periods of a model baseball
league that consists of: (i) 16 teams that play 60
FIG. 2: Convergence of $W(r)$ versus scaled rank $r$ as a function
of season length for 1961-2005, using $x_{\min} = 0.435$ and
30 teams. The circles and the thick dashed curve are the
baseball data and the corresponding BT model data for an
$n = 162$ game season. The thin dashed lines are model data
for a season of $n = 300$, 500, and 1000 games averaged over
100000 seasons. The full line corresponds to the model for an
infinitely long season with 30 teams. Finally, the + symbols
give the result of Eq. (2), which corresponds to an infinite-length
season and an infinite number of teams.



seasons of 154 games (corresponding to 1901-60) and (ii)
30 teams that play 45 seasons of 162 games (1961-2005),
with uniformly distributed strengths in $[x_{\min}, 1]$ for both
cases, but with different values of $x_{\min}$. Using the winning
probability $p_{ij}$ of Eq. (1), we then compute the average
win fraction $W(r)$ of each team as a function of its
scaled rank $r$. We then incrementally update the value
of $x_{\min}$ to minimize the difference between the simulated
values of $W(r)$ and those from game win/loss data.
Nearly the same results are found if each team plays every
opponent with equal probability or equally often, as long
as the number of teams and number of games are not unrealistically
small. The BT model, with each team playing
each opponent with the same probability, gives very good
fits to the data by choosing $x_{\min} = 0.278$ for the period
1901-60, and $x_{\min} = 0.435$ for 1961-2005 (Fig. 1). If
the actual game frequencies in each season are used to
determine opponents, $x_{\min}$ changes slightly, to 0.289, for
1901-60 but remains unchanged for 1961-2005.
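
The simulation described above can be sketched in a few lines of Python with NumPy. This is a simplified toy version under stated assumptions, not the code used for the figures: in every round all teams are paired at random (so each opponent is met with equal probability), the number of teams must be even, and the function name `simulate_win_fraction` and its arguments are hypothetical.

```python
import numpy as np


def simulate_win_fraction(n_teams, n_games, n_seasons, x_min, seed=0):
    """Average win fraction by end-of-season rank (weakest record first) in a toy BT league."""
    if n_teams % 2:
        raise ValueError("this sketch pairs all teams each round, so n_teams must be even")
    rng = np.random.default_rng(seed)
    ranked_total = np.zeros(n_teams)
    for _ in range(n_seasons):
        # Fresh strengths each season, uniform in [x_min, 1].
        strengths = rng.uniform(x_min, 1.0, size=n_teams)
        wins = np.zeros(n_teams)
        for _ in range(n_games):
            # Random pairing: every team plays exactly one game per round.
            order = rng.permutation(n_teams)
            a, b = order[0::2], order[1::2]
            p_a = strengths[a] / (strengths[a] + strengths[b])
            a_wins = rng.random(a.size) < p_a
            wins[a] += a_wins
            wins[b] += ~a_wins
        # Rank teams by their record in this season, then accumulate by rank.
        ranked_total += np.sort(wins) / n_games
    return ranked_total / n_seasons


# Toy analogue of the early era: 16 teams, 154 games, with the fitted x_min.
print(simulate_win_fraction(16, 154, 60, 0.278).round(3))
```

Fitting $x_{\min}$ would then amount to repeating this simulation for trial values and keeping the one whose rank-ordered win fractions are closest to the data.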

Despite the fact that the number of teams has increased
from 16 to 30 since 1961, the range of win
fractions is larger in the early era (0.32-0.67) than in the
expansion era (0.36-0.63), a feature that indicates that
baseball has become more competitive. This observation
accords with the notion that the pressure of continuous
competition, as in baseball, gradually diminishes the likelihood
of outliers. Given the crudeness of the model
and real features that we have ignored, such as home-field
advantage (approximately 53% for the past century and
slowly decreasing with time), imbalanced playing schedules,
and in-season personnel changes due to trades and
player injuries, the agreement between the data and simulations
of the BT model is satisfying.

It is worth noting in Fig. 1 that the win fraction
data and the corresponding numerical results from simulations
of the BT model deviate from the theoretical
prediction given in Eq. (2) when $r \to 0$ and $r \to 1$. This
discrepancy is simply a finite-season effect. As shown in
Fig. 2, when we simulate the BT model for progressively
longer seasons, the win/loss data gradually converge to
the prediction of Eq. (2).

The present model not only reproduces the average
win record $W(r)$ over a given period, but it also correctly
explains the season-to-season fluctuation $\sigma(r)$ of the win
fraction, defined as

$$\sigma^2(r) = \frac{1}{Y}\sum_{j=1}^{Y} \left[W_j(r) - W(r)\right]^2, \qquad (3)$$

where $W_j(r)$ is the winning fraction of the $r$th-ranked
team during the $j$th season,

$$W(r) = \frac{1}{Y}\sum_{j=1}^{Y} W_j(r)$$

is the average win fraction of the $r$th-ranked team, and $Y$
is the number of years in the period. These fluctuations
are the largest for extremal teams (and minimal for average
teams). There is also an asymmetry of $\sigma(r)$ with respect
to $r = 1/2$. Our simulations of the BT model with
the optimal $x_{\min}$ values that were determined previously
by fitting to the win fraction quantitatively reproduce
these two features of $\sigma(r)$.
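
For concreteness, the season-to-season fluctuation can be computed from a table of rank-ordered win fractions as follows. This is a sketch that assumes $\sigma(r)$ is the root-mean-square deviation about the period average, with a $1/Y$ normalization; the function name and the array layout are illustrative assumptions.

```python
import numpy as np


def rank_fluctuation(win_fractions):
    """Return (W(r), sigma(r)) from win_fractions[j][r], the win fraction of rank r in season j."""
    w = np.asarray(win_fractions, dtype=float)
    mean_by_rank = w.mean(axis=0)  # average over the Y seasons for each rank
    # Assumption: population (1/Y) normalization of the squared deviations.
    sigma = np.sqrt(((w - mean_by_rank) ** 2).mean(axis=0))
    return mean_by_rank, sigma


# Two toy seasons of a two-team league (made-up numbers, for illustration only).
mean_by_rank, sigma = rank_fluctuation([[0.4, 0.6], [0.3, 0.7]])
print(mean_by_rank, sigma)
```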
FIG. 3: Season-to-season fluctuation $\sigma(r)$ for 1901-60 (△)
and for 1961-2005 (○). The dashed lines are numerical simulations
of the BT model for $10^4$ periods with the same $x_{\min}$ values
as in Fig. 1.



In addition to the finite-season effects described above,
another basic consequence of the finiteness of the season
is that the intrinsically strongest team does not necessarily
have the best win/loss record. That is, the average
win fraction $W$ does not necessarily increase with team
strength. By luck, a strong team can have a poor record
or vice versa. It is instructive to estimate the number
of games $G$ that need to be played to ensure that the
win/loss record properly reflects team strength. The difference
in the number of wins of two adjacent teams in the
standings is proportional to $G(1 - x_{\min})/T$, namely, the
number of games times their strength difference; the latter
is proportional to $(1 - x_{\min})/T$ for a league that consists
of $T$ teams. This systematic contribution to the difference
should significantly exceed random fluctuations,
which are of the order of $\sqrt{G}$. Thus we require

$$G \gg \left(\frac{T}{1 - x_{\min}}\right)^2 \qquad (4)$$

for the end-of-season standings to be ordered by team
strength. Figs. 2 and 3 illustrate the fact that this
effect is more important for the top-ranked and bottom-ranked
teams. During the 1901-60 period, when major-league
baseball consisted of independent American and
National leagues, $T = 8$, $G = 154$, and $x_{\min} \approx 0.3$, so
that the season was just long enough to resolve adjacent
teams. Currently, however, the season length is insufficient
to resolve adjacent teams. The natural way to deal
with this ambiguity is to expand the number of teams
that qualify for the post-season playoffs, which is what is
currently done.
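
The criterion that the season length must greatly exceed $[T/(1 - x_{\min})]^2$ is easy to evaluate. The sketch below uses the early-era values quoted in the text ($T = 8$, $x_{\min} \approx 0.3$); the expansion-era call uses $T = 14$ (the size of the American League in 2005) purely as an illustrative assumption, and `games_needed` is a hypothetical helper name.

```python
def games_needed(n_teams: int, x_min: float) -> float:
    """Scale (T / (1 - x_min))**2 that the number of games G must greatly exceed."""
    return (n_teams / (1.0 - x_min)) ** 2


# Early era: 8-team leagues and a 154-game season.
print(round(games_needed(8, 0.3), 1), "compared with G = 154")
# Expansion era (assumed T = 14): the scale is far above the 162-game season.
print(round(games_needed(14, 0.435), 1), "compared with G = 162")
```

For the early era the scale is about 131 games, comparable to the 154-game season, whereas for the expansion era it is several times larger than 162, which is the sense in which the current season cannot resolve adjacent teams.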

B. Applicability of the Bradley-Terry Model

Does the BT model with uniform team strengths provide
the most appropriate description of the win/loss
data? We perform several tests to validate this model.
First, as mentioned in the previous section, the assumption
(1) for the winning probability can be recast more
generally as

$$p_{ij} = \frac{f(x_i)}{f(x_i) + f(x_j)}, \qquad (5)$$

so that the substitution $X_i = f(x_i)$ reduces Eq. (5) to the
original winning probability in Eq. (1). Hence the crucial
model assumption is the separability of the winning
probability. In particular, the BT model assumes that
$p_{ij}/p_{ji} = p_{ij}/(1 - p_{ij})$ is only a function of characteristics
of team $i$, divided by characteristics of team $j$. One
consequence of this separability is the "detailed-balance"
relation

$$\frac{p_{ik}}{1 - p_{ik}}\,\frac{p_{kj}}{1 - p_{kj}} = \frac{p_{ij}}{1 - p_{ij}} \qquad (6)$$

for any triplet of teams. This relation quantifies the obvious
fact that if team A likely beats B, and B likely beats
C, then A is likely to beat C. Since we do not know the
actual $p_{ij}$ in a given baseball season, we instead consider

$$z_{ij} = \frac{W_{ij}}{G_{ij} - W_{ij}}, \qquad (7)$$

where $W_{ij}$ is the number of wins of team $i$ against $j$,
and $G_{ij}$ is the number of games they played against each
other in a given season. If seasons were infinitely long,
then $z_{ij} \to p_{ij}/(1 - p_{ij})$, and hence

$$z_{ik}\, z_{kj} = z_{ij}. \qquad (8)$$

FIG. 4: Comparison of the detailed-balance relation Eq. (8)
for baseball data to the results of the BT model over $10^4$
periods (dashed lines), where each period corresponds to the
results of all baseball games during either 1901-60 (triangles)
or 1961-2005 (circles). The $x_{\min}$ values are the same as in
Fig. 1. The straight lines are guides for the eye, with slope
0.63 for the data for 1901-60 and 0.30 for 1961-2005.
FIG. 5: Dependence of $\langle \ln(z_{ik} z_{kj}) \rangle$ vs $\langle \ln(z_{ij}) \rangle$ on season
length for the 1961-2005 period. All $G_{ij}$'s are multiplied by
$M = 5$, 10, 100 (steepening dot-dashed lines). The thick
dashed line corresponds to $M = 10^4$ and is indistinguishable
from a linear dependence with unit slope.

To test the detailed balance relation Eq. (8), we plot
$\langle \ln(z_{ik} z_{kj}) \rangle$ as a function of $\langle \ln(z_{ij}) \rangle$ from game data,
averaged over all team triplets $(i, j, k)$ and all seasons
in a given period (Fig. 4). We discard events for which
$W_{ij} = G_{ij}$ or $W_{ij} = 0$ (team $i$ won or lost all games
against team $j$). Our simulations of the BT model over
$10^4$ realizations of the 1901-60 and 1961-2005 periods
with the same $G_{ij}$ as in actual baseball seasons and with
the optimal values of $x_{\min}$ for each period are in excellent
agreement with the game data. Although $z_{ik} z_{kj}$ in
the figure has a sublinear dependence on $z_{ij}$ (slope much
less than 1 in Fig. 4), the slope progressively increases
and ultimately approaches the expected linear relation
between $z_{ik} z_{kj}$ and $z_{ij}$ as the season length is increased
(Fig. 5). We implement an increased season length by
multiplying all the $G_{ij}$ by the same factor $M$. Notice
also that $\langle \ln(z_{ik} z_{kj}) \rangle$ versus $\langle \ln(z_{ij}) \rangle$ for the 1901-60 period
has a larger slope than for 1961-2005 because the
$G_{ij}$'s are larger in the former period ($G_{ij} = 22$) than in
the latter ($G_{ij}$ in the range 5-19).

This study of game outcomes among triplets of teams
provides a detailed and non-trivial validation for the BT
form Eq. (1) for the winning probability. As a byproduct,
we learn that cyclic game outcomes, in which team A
beats B, B beats C, and C beats A, are unlikely to occur.
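
The detailed-balance relation is straightforward to verify for exact Bradley-Terry probabilities, and the empirical odds are equally simple to compute. In the sketch below the function names are hypothetical, and the empirical odds are taken to be wins divided by losses in a head-to-head series, with sweeps discarded as described in the text.

```python
import math


def bt_odds(x_i: float, x_j: float) -> float:
    """Exact odds p_ij / (1 - p_ij) in the BT model; algebraically equal to x_i / x_j."""
    p = x_i / (x_i + x_j)
    return p / (1.0 - p)


def empirical_odds(wins: int, games: int) -> float:
    """Empirical odds: wins over losses in a head-to-head series of `games` games."""
    if wins <= 0 or wins >= games:
        raise ValueError("discard series in which one team won or lost every game")
    return wins / (games - wins)


# For exact BT probabilities, odds multiply along a chain i -> k -> j.
lhs = bt_odds(0.9, 0.5) * bt_odds(0.5, 0.3)
rhs = bt_odds(0.9, 0.3)
print(math.isclose(lhs, rhs))

# A 22-game series won 12-10 (illustrative numbers only).
print(empirical_odds(12, 22))
```

In a finite season the empirical odds fluctuate around the exact odds, which is why the measured slope is below one and only approaches unity as the number of games per pair grows.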

C. Distribution of Team Strengths 

Thus far, we have used a uniform distribution of team
strengths to derive the average win fraction for the BT
model. We now determine the most likely strength distribution
by searching for the distribution that gives the
best fit to the game data for $W(r)$ by minimizing the
deviation $\Delta$ between the data and the simulated form of
$W(r)$. Here the deviation $\Delta$ is defined as

$$\Delta = \sum_r \left[W(r) - W(r; \rho)\right]^2, \qquad (9)$$

where $W(r; \rho)$ is the winning fraction in simulations of
the BT model for a trial distribution $\rho(x)$ in which the
actual game frequencies $G_{ij}$ were used in the simulation,
and $W(r)$ is the game data for the winning fraction.

We assume that the two periods 1901-60 and 1961-2005
are long enough for $W(r)$ to converge to its average
value. We parameterize the trial strength distribution as
a piecewise linear function of $n$ points, $\{\rho(y_i)\}$, with $y_i \in
[0, 1]$ and $y_n = 1$. We then perform Monte Carlo (MC)
simulations, in which we update the $y_i$ and $\rho_i = \rho(y_i)$ by
small amounts in each step to reduce $\Delta$. Specifically, at
each MC step, we select one value of $i = 1, \ldots, n$, and

• with probability 1/2, adjust $y_i$ (except $y_n = 1$) by
$\pm u\,\delta y/10$, where $\delta y$ is the spacing between $y_i$ and
its nearest neighbor, and $u$ is a uniform random
number between 0 and 1;

• with probability 1/2, update $\rho(y_i)$ by $\pm u\,\rho(y_i)/10$.

If $\Delta$ decreases as a result of this update, then $y_i$ or $\rho(y_i)$
is set to its new value; otherwise the change in the parameter
value is rejected. We choose $n = 8$, which is large
enough to obtain a distribution with significant features
and for which typically 1000-2000 MC steps are sufficient
for convergence. A larger $n$ greatly increases the
number of MC steps necessary to converge and also increases
the risk of being trapped in a metastable state
because the size of the phase space grows exponentially
with $n$. To check that this algorithm does not get trapped
in a metastable state, we started from several different
initial states and found virtually identical final distributions
(Fig. 6). The MC-optimized distribution for each
period is remarkably close to uniform, as shown in this
figure.
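
The update rule above can be sketched as a greedy random search. The sketch below makes one important substitution, which is an assumption of this example and not the procedure of the text: instead of simulating seasons with the actual game frequencies, the model win fraction for a trial distribution is evaluated in the infinite-season limit by numerical integration on a grid. All names (`model_win_fraction`, `deviation`, `optimize`) are hypothetical.

```python
import numpy as np


def model_win_fraction(y, rho, ranks, m=400):
    """Infinite-season W(r) for a piecewise-linear strength density through (y_i, rho_i)."""
    xs = np.linspace(y[0], y[-1], m)
    w = np.interp(xs, y, rho)
    w = w / w.sum()                       # discrete probability weights on the grid
    win = (w[None, :] * xs[:, None] / (xs[:, None] + xs[None, :])).sum(axis=1)
    cdf = np.cumsum(w) - 0.5 * w          # scaled rank of each grid strength
    return np.interp(ranks, cdf, win)


def deviation(y, rho, ranks, target):
    """Sum of squared differences between target and model win fractions."""
    return float(np.sum((target - model_win_fraction(y, rho, ranks)) ** 2))


def optimize(y, rho, ranks, target, n_steps=2000, seed=0):
    """Greedy MC search: accept a move only if the deviation decreases."""
    rng = np.random.default_rng(seed)
    y, rho = np.array(y, dtype=float), np.array(rho, dtype=float)
    best, n = deviation(y, rho, ranks, target), len(y)
    for _ in range(n_steps):
        i = rng.integers(n)
        sign, u = rng.choice((-1.0, 1.0)), rng.random()
        y_new, rho_new = y.copy(), rho.copy()
        if rng.random() < 0.5:
            if i == n - 1:
                continue                  # the last node y_n = 1 stays fixed
            gaps = [abs(y[i] - y[k]) for k in (i - 1, i + 1) if 0 <= k < n]
            y_new[i] += sign * u * min(gaps) / 10.0
            if y_new[0] <= 0.0:
                continue                  # keep strengths positive
        else:
            rho_new[i] *= 1.0 + sign * u / 10.0
        trial = deviation(y_new, rho_new, ranks, target)
        if trial < best:
            y, rho, best = y_new, rho_new, trial
    return y, rho, best
```

A typical use is to start from a deliberately wrong shape, for example a V-shaped density on $[0.1, 1]$ with $n = 8$ nodes, and to check that the deviation from a target $W(r)$ only ever decreases. Because moves of $y_i$ are limited to a tenth of the nearest-neighbor spacing, the nodes remain ordered throughout.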
FIG. 6: Optimized strength distributions $\rho(x)$ for 1901-60
(triangles) and 1961-2005 (circles), together with the optimal
uniform distributions (dashed). For 1961-2005, we also
show the final distributions starting from $y_i$'s equally spaced
between $y_1 = 0.1$ and $y_8 = 1$ with the distribution $\rho$: (a) uniform
on [0.1, 1] (open circles), and (b) a symmetric V-shape
on [0.1, 1] (full circles).
FIG. 7: Comparison of the winning fraction $W(r)$ extracted
from the actual baseball data (symbols) to the model with a
constant $\rho(x)$ (dashed lines), and with the optimal log-normal
distribution $\rho(x)$ (full lines).

Although the optimal distributions are visually not
uniform, the small difference in the relative errors, the
closeness of $y_1$ and $x_{\min}$, and the imperceptible difference
in the $r$ dependence of $W(r)$ for the uniform and
optimized strength distributions suggest that a uniform
team strength distribution on $[x_{\min}, 1]$ describes the game
data quite well.

For completeness, we also considered the
conventionally-used log-normal distribution of team
strengths,

$$\rho(x) = \frac{1}{\sqrt{2\pi\kappa^2}}\,\frac{1}{x}\,\exp\left[-\frac{\left(\ln(x/\bar{x}) + \kappa^2/2\right)^2}{2\kappa^2}\right]. \qquad (10)$$

With the normalization convention of Eq. (10), the average
team strength is simply $\bar{x}$, which can be set to
any value due to the invariance of $p_{ij}$ with respect to
the transformation $x \to \lambda x$. Hence, the only relevant
parameter is the width $\kappa$. Using the same MC optimization
procedure described above, we find that a log-normal
ansatz for the strength distribution with optimal parameter
$\kappa$ gives a visually inferior fit of the winning fraction in
both periods compared to the uniform strength distribution,
especially for $r$ close to 1 (see Fig. 7). The relative
error for the log-normal distribution is also a factor of
6 and 3 larger, respectively, than for the optimal distribution
in the 1901-60 and 1961-2005 periods. However,
we do reproduce the feature that the optimal log-normal
distribution for 1961-2005 is narrower ($\kappa = 0.238$) than
that for 1901-60 ($\kappa = 0.353$), indicating again that baseball
is more competitive in the second period than in the
first.



III. WINNING AND LOSING STREAK 
STATISTICS 

We now turn to the distribution of consecutive-game
winning and losing streaks. Namely, what are the probabilities
$W_n$ and $L_n$ to observe a string of $n$ consecutive
wins or $n$ consecutive losses, respectively? Because
of its emotional appeal, streakiness in a wide variety of
sports continues to be vigorously researched and debated.
In this section, we argue that independent
game outcomes that depend only on relative team
strengths describe the streak data for the period 1961-2005
quite well. The agreement is not as good for the
period 1901-60 and suggests that non-statistical effects
may have played a role in the longest streaks.

Historically, the longest team winning streak (with ties
allowed) in major-league baseball is 26 games, achieved
by the 1916 New York Giants in the National League
over a 152-game season. The record for a pure winning
streak since 1901 (no ties) is 21 games, set by the
Chicago Cubs in 1935 in a 154-game season, while the
American League record is a 20-game winning streak by
the 2002 Oakland Athletics over the now-current 162-game
season. Conversely, the longest losing streak since
1901 is 23, achieved by the 1961 Philadelphia Phillies
in the National League, and the American League
losing-streak record is 21 games, set by the Baltimore
Orioles at the start of the 1988 season. For completeness,
the list of all winning and all losing streaks of > 15 games
is given in the appendix.
FIG. 8: Distribution of winning/losing streaks $P_n$ versus $n$
since 1901 on a semi-logarithmic scale for 1901-60 (△) and
1961-2005 (•). The dashed curves are the result of simulations
with $x_{\min} = 0.278$ and $x_{\min} = 0.435$ for the two respective
periods. The smooth curves are streak data from
randomized win/loss records, and the dotted curve is $2^{-n}$.

Fig. 8 shows the distribution of team winning and losing
streaks in major-league baseball since 1901. Because
these winning and losing streak distributions are virtually
identical for $n < 15$, we consider $P_n = (W_n + L_n)/2$,
the probability of a winning or a losing streak of length
$n$ (Fig. 8). It is revealing to separate the streak distributions
for 1901-60 and 1961-2005. Their distinctness
is again consistent with the hypothesis that baseball is
becoming more competitive. In fact, exceptional streaks
were much more likely during 1901-60 than after 1961.
Of the 55 streaks of > 15 games, 27 occurred between
1901-30, 13 between 1931-60, and 15 after 1960.

The first point about the streak distributions is that
they decay exponentially with $n$, for large $n$. This behavior
is a simple consequence of the following bound:
consider a baseball league that consists of teams with either
strengths $x = 1$ or $x = x_{\min} > 0$, and with games
only between strong and weak teams. Then the distribution
of winning streaks of the strong teams decays as
$(1 + x_{\min})^{-n}$; this represents an obvious upper bound for
the streak distribution in a league where team strengths
are uniformly distributed in $[x_{\min}, 1]$.

We now apply the BT model to determine the form
of the consecutive-game winning and losing streak distributions.
Using Eq. (1) for the single-game outcome
probability, the probability that a team of strength $x$
has a streak of $n$ consecutive wins is

$$P_n(x) = \left[\prod_{j=1}^{n} \frac{x}{x + x_j}\right] \frac{x_0}{x + x_0}\, \frac{x_{n+1}}{x + x_{n+1}}. \qquad (11)$$

The product gives the probability for $n$ consecutive wins
against teams of strengths $x_j$, $j = 1, 2, \ldots, n$ (some factors
possibly repeated), while the last two factors give
the probability that the 0th and the $(n+1)$st games are
losses to terminate the winning streak at $n$ games. Assuming
a uniform team strength distribution $\rho(x)$, and
for the case where each team plays the same number of
games with every opponent, we average Eq. (11) over all
opponents and then over all teams.
The first average gives

$$\langle P_n(x) \rangle = W(x)^n \left[1 - W(x)\right]^2, \qquad W(x) = \frac{x}{1 - x_{\min}} \ln\left(\frac{x + 1}{x + x_{\min}}\right), \qquad (12)$$

for a uniform distribution of team strengths in $[x_{\min}, 1]$.
Here we use the fact that each team strength is independent,
so that the product in Eq. (11) factorizes. We now
average over the uniform strength distribution, to find,
for the team-averaged probability to have a streak of $n$
consecutive wins,

$$\langle P_n \rangle = \frac{1}{1 - x_{\min}} \int_{x_{\min}}^{1} f(x)\, e^{n g(x)}\, dx, \qquad (13)$$

where $f(x) = [1 - W(x)]^2$ and $g(x) = \ln W(x)$.
Since $g(x)$ monotonically increases with $x$ within
$[x_{\min}, 1]$, the integral in Eq. (13) is dominated by the
behavior near the maximum of $g(x)$ at $x = 1$ for large $n$.
Performing the integral by parts, the leading behavior is

$$\langle P_n \rangle \sim \frac{f(1)}{(1 - x_{\min})\, g'(1)}\, \frac{e^{n g(1)}}{n}, \qquad (14)$$

with

$$g(1) = -\ln(1 - x_{\min}) + \ln \ln\left(\frac{2}{1 + x_{\min}}\right).$$
As expected, $\langle P_n \rangle$ decays exponentially with $n$, but
with a decay rate that decreases as teams become more
heterogeneous (decreasing $x_{\min}$). In the limit of equal-strength
teams, the most rapid decay of the streak probability
arises, $P_n \sim 2^{-n}$, while the widest disparity in
team strengths, $x_{\min} \to 0$, leads to the slowest possible
decay $P_n \sim (\ln 2)^n \approx (0.693)^n$.
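
The exponential decay and its two limits can be checked numerically. The sketch below assumes that the team-averaged streak probability is the average over $x$, uniform in $[x_{\min}, 1]$, of $W(x)^n [1 - W(x)]^2$, where $W(x)$ is the infinite-season win fraction; the per-game decay factor is then $e^{g(1)} = W(1)$. Function names are illustrative assumptions.

```python
import math


def win_fraction(x: float, x_min: float) -> float:
    """Infinite-season win fraction of a team of strength x (opponents uniform in [x_min, 1])."""
    return x / (1.0 - x_min) * math.log((x + 1.0) / (x + x_min))


def decay_base(x_min: float) -> float:
    """Per-game factor exp(g(1)) = W(1) that governs the large-n decay of the streak probability."""
    return math.log(2.0 / (1.0 + x_min)) / (1.0 - x_min)


def mean_streak_probability(n: int, x_min: float, m: int = 20_000) -> float:
    """Midpoint-rule average over team strengths of W(x)**n * (1 - W(x))**2."""
    h = (1.0 - x_min) / m
    total = 0.0
    for k in range(m):
        w = win_fraction(x_min + (k + 0.5) * h, x_min)
        total += w ** n * (1.0 - w) ** 2
    return total / m


# Limits quoted in the text: ln 2 for x_min -> 0 and 1/2 for nearly equal teams.
print(decay_base(0.0), decay_base(0.999999))
# Ratio of successive streak probabilities approaches the decay base from below.
ratio = mean_streak_probability(201, 0.435) / mean_streak_probability(200, 0.435)
print(ratio, decay_base(0.435))
```

The ratio stays slightly below the decay base at finite $n$ because of the $1/n$ prefactor that accompanies the exponential.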

We simulated the streak distribution $P_n$ using the same
methodology as that for the win/loss records; related
simulations of streak statistics have been reported previously.
Taking $x_{\min} = 0.435$ for 1961-2005 (the same value as
that used in simulations of the win/loss records), we find
a good match to the streak data for this period. The apparent
systematic discrepancy between data and theory
for $n > 17$ is illusory because streaks do not exist for
every value of $n$. Moreover, the number of streaks of
length $n > 17$ is only eight, so that fluctuations are quite
important.

For the 1901-60 period, if we use $x_{\min} = 0.278$, the
data for $P_n$ are in excellent agreement with theory for
$n < 17$. However, for $n$ in the range 17-22, the data are
roughly a factor of 2 greater than that given by the analytical
solution Eq. (13) or by simulations of the BT model.
Thus the tail of the streak distribution for this early period
appears to disagree with a purely statistical model
of streaks. Again, the number of events for any $n > 17$ is 5
or less, compared to a total number of ~70000 winning
and losing streaks during this period. Hence one cannot
exclude the possibility that the observed discrepancy for
$n > 17$ is simply due to lack of statistics.

Finally, we test for the possible role of self- 
reinforcement on winning and losing streaks. To this end, 
we take each of the 2166 season-by-season win/loss his- 
tories for each team and randomize them 10^ times. For 
each such realization of a randomized history, we com- 
pute the streak distribution and superpose the results 
for all randomized histories. The large amount of data 
gives streak distributions with negligible fluctuations up 
to n = 30 and which extend to n = 44 and 41 for the two 
successive periods. More strikingly, these streak distributions
based on randomized win/loss records are virtually
identical to the simulated streak data as well as to the
numerical integration of Eq. (13), as shown in Fig. 8.
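
The randomization test can be sketched as follows. The example assumes that "randomizing" a history means shuffling the order of a team's wins and losses within one season while keeping the totals fixed; it also counts every maximal run, including those at the start and end of the season. The function names and the 90-72 record are hypothetical.

```python
import random
from collections import Counter


def streak_lengths(outcomes):
    """Lengths of maximal runs of identical outcomes, e.g. 'WWLWWW' -> [2, 1, 3]."""
    lengths, run, previous = [], 0, None
    for outcome in outcomes:
        if outcome == previous:
            run += 1
        else:
            if run:
                lengths.append(run)
            run, previous = 1, outcome
    if run:
        lengths.append(run)
    return lengths


def randomized_streak_counts(outcomes, n_shuffles=1000, seed=0):
    """Histogram of streak lengths accumulated over random reorderings of one season."""
    rng = random.Random(seed)
    games = list(outcomes)
    counts = Counter()
    for _ in range(n_shuffles):
        rng.shuffle(games)
        counts.update(streak_lengths(games))
    return counts


# A made-up 90-72 season, shuffled 200 times.
counts = randomized_streak_counts("W" * 90 + "L" * 72, n_shuffles=200)
print(sorted(counts.items())[:5])
```

If real streaks were self-reinforcing, the actual season would show an excess of long runs relative to this shuffled histogram; the text reports no such excess for 1961-2005.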



IV. SUMMARY 

To conclude, the Bradley-Terry (BT) competition
model, in which the outcome of any game depends
only on the relative strengths of the two competing
teams, quantitatively accounts for the average win/loss
records of Major-League baseball teams. The distribution
of team strengths that gives the best match to these
win/loss records was found to be quite close to uniform
over a range $[x_{\min}, 1]$, with $x_{\min} \approx 0.28$ for the early modern
era of 1901-1960 and $x_{\min} \approx 0.44$ for the expansion
era of 1961-2005. This same BT model also reproduces
the season-to-season fluctuations of the win/loss records.
An important consequence of the BT model is the existence
of a non-trivial detailed-balance relation, which
we verified with satisfying accuracy. We consider this
verification to be a quite stringent test of the theory.

The same BT model was also used to account for the 
distribution of team consecutive-game winning and losing 
streaks. We found excellent agreement between the pre- 
diction of the BT model and the streak data for n < 17 
for both the 1901-60 and 1961-2005 periods. However, 
the tail of the streak distribution for the 1901-60 period
with n > 17 is less accurately described by the BT theory.
The mechanism behind this discrepancy remains an open
question, although it could well originate from a lack of
statistics. We also provided evidence that self-reinforcement
plays little role in streaks, as randomizations of the actual
win/loss records produce streak distributions that are
indistinguishable from the streak data, except for the
n > 17 tail during the 1901-60 period.

We also showed that the optimal team strength distri- 
bution is narrower for the period 1961-2005 compared to 
1901-60. This narrowing shows that baseball competi- 
tion is becoming keener so that outliers in team perfor- 
mance over an entire season — as quantified by win/loss 
records and lengths of winning and losing streaks — are 
less likely to occur. 
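
The effect of the narrowing can be made concrete with a short calculation. Assuming the standard Bradley-Terry form x/(x + y) and an idealized league with very many teams whose strengths are uniform on [x_min, 1], the average win probability of a team of strength x is (1/(1 - x_min)) ∫ x/(x + y) dy over [x_min, 1], which equals x ln[(x + 1)/(x + x_min)]/(1 - x_min). The Python sketch below evaluates this expression for the strongest and weakest possible teams; the function name `mean_win_fraction` and the many-team idealization are assumptions of the example.

```python
import math


def mean_win_fraction(x, x_min):
    """Average BT win probability of a team of strength x against an
    opponent whose strength is uniform on [x_min, 1] (continuum limit)."""
    return x / (1.0 - x_min) * math.log((x + 1.0) / (x + x_min))


# Compare the two eras using the fitted values of x_min quoted in the text.
for era, x_min in (("1901-1960", 0.28), ("1961-2005", 0.44)):
    best = mean_win_fraction(1.0, x_min)      # strongest possible team
    worst = mean_win_fraction(x_min, x_min)   # weakest possible team
    print(f"{era}: x_min={x_min:.2f}  best={best:.3f}  worst={worst:.3f}")
```

In this idealization the extreme win fractions are roughly 0.62 and 0.32 for x_min = 0.28, and roughly 0.59 and 0.39 for x_min = 0.44, so the spread between the best and worst teams shrinks as the strength distribution narrows.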

We close by emphasizing the parsimonious nature
of our modeling. The only assumed features are the
Bradley-Terry form Eq. (Q) for the outcome of a single
game, and the uniform distribution of the winning
probabilities, controlled by the single free parameter x_min.
All other model features can then be inferred from the data.
While we have ignored many aspects of baseball that
ought to play some role (changes in team strength
during a season due to major trades of players and/or
injuries, home-field advantage, etc.), the agreement of
the win fraction data and the streak data with
the predictions of the Bradley-Terry model is extremely
good. It will be worthwhile to apply the approaches of
this paper to other major sports to learn about possible
universalities and idiosyncrasies in the statistical features
of game outcomes.

APPENDIX: TEAM WINNING AND LOSING STREAKS



TABLE I. Winning streaks of n ≥ 15 games since 1901.

n    year  team
26   1916  New York Giants (1 tie)
21   1935  Chicago Cubs
20   2002  Oakland Athletics
19   1906  Chicago White Sox (1 tie)
19   1947  New York Yankees
18   1904  New York Giants
18   1953  New York Yankees
17   1907  New York Giants
17   1912  Washington Senators
17   1916  New York Giants
17   1931  Philadelphia Athletics
16   1909  Pittsburgh Pirates
16   1912  New York Giants
16   1926  New York Yankees
16   1951  New York Giants
16   1977  Kansas City Royals
15   1903  Pittsburgh Pirates
15   1906  New York Highlanders
15   1913  Philadelphia Athletics
15   1924  Brooklyn Dodgers
15   1936  Chicago Cubs
15   1936  New York Giants
15   1946  Boston Red Sox
15   1960  New York Yankees
15   1991  Minnesota Twins
15   2000  Atlanta Braves
15   2001  Seattle Mariners

TABLE II. Losing streaks of n ≥ 15 games since 1901.

n    year  team
23   1961  Philadelphia Phillies
21   1988  Baltimore Orioles
20   1906  Boston Americans
20   1906  Philadelphia A's
20   1916  Philadelphia A's
20   1969  Montreal Expos (first year)
19   1906  Boston Beaneaters
19   1914  Cincinnati Reds
19   1975  Detroit Tigers
19   2005  Kansas City Royals
18   1920  Philadelphia A's
18   1948  Washington Senators
18   1959  Washington Senators
17   1926  Boston Red Sox
17   1962  NY Mets (first year)
17   1977  Atlanta Braves
16   1911  Boston Braves
16   1907  Boston Doves
16   1907  Boston Americans (2 ties)
16   1944  Brooklyn Dodgers (1 made-up game)
io   iyuy  St. Louis Browns
15   1911  Boston Rustlers
15   1927  Boston Braves
15   1927  Boston Red Sox
15   1935  Boston Braves
15   1937  Philadelphia A's
15   2002  Tampa Bay
15   1972  Texas Rangers (first year)



